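2024 iThome Ironman Contest

DAY 7

Generative AI

A Python Beginner's AI Journey: Building Your Own AI / LLM Application from Scratch, Part 7

[Day 7] Letting the Model Use Tools (2): Improving Output Stability with the Tool Use (Function Calling) API

16th Ironman Contest

海狸大師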

2024-09-21 00:20:47


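Today's code is also available on GitHub.

There are some problems…

Yesterday we let the model call external functions through a special token like <API>, and the results were quite good: it really did get every calculation right. You may also have started to realize that tool use is, at its core, just text continuation.

But we also noticed a problem: the model's output is always a matter of probability. Even if your fine-tuned model or your prompt is good enough to produce the right format 99% of the time, it will still get it wrong the other 1%. So what do we do about that?

~~(Honestly, I'm not sure what I'm asking either)~~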
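To put a number on that 1%, here is a small back-of-the-envelope sketch. It assumes every call fails independently with the same probability, which is a simplification I am adding for illustration, and the function name is made up for this example.

```python
def prob_at_least_one_failure(success_rate: float, calls: int) -> float:
    """Chance that at least one of `calls` independent requests breaks the format."""
    return 1.0 - success_rate ** calls


# With a 99% per-call success rate, failures pile up quickly as traffic grows.
for n in (1, 10, 100, 1000):
    print(n, round(prob_at_least_one_failure(0.99, n), 4))
```

At 100 calls the chance of seeing at least one malformed output is already about 63%, so "99% is good enough" stops being true as soon as real users show up.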

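Solution

You could of course call the model again and again until the answer comes out right, but that is not practical: the model can always produce an output you never anticipated. This is a genuinely hard problem. Fortunately, at the time of writing, LLM API providers such as OpenAI and Groq all offer a dedicated way to let the model use tools.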

OpenAI function calling

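Official documentation

Let's see how OpenAI does it. Its flowchart breaks down into five steps:

1. Send a request to an OpenAI model through the function calling API
2. The model decides whether it needs to use a function
3. OpenAI returns the model's decision to you
4. Your program calls the functions the model thinks it needs
5. Send another request to OpenAI containing your prompt together with the function results

In short, let the model first decide whether to use a function call. If it does, run the function yourself, then send the result back along with the original prompt so the model can produce the final answer.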
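The five steps can be sketched without touching any real API. In the sketch below, `fake_model` is a hypothetical stand-in I made up to play the role of the provider, and `get_length` is a toy tool; only the shape of the loop matters here.

```python
import json


def get_length(text: str) -> str:
    """A toy tool: return the length of a string, as text."""
    return str(len(text))


TOOLS = {"get_length": get_length}


def fake_model(messages):
    """Hypothetical stand-in for the provider's API (not a real client)."""
    last = messages[-1]
    if last["role"] == "tool":
        # A tool result is present, so answer in plain text.
        return {"content": f"The answer is {last['content']}.", "tool_calls": None}
    # Otherwise pretend the model decided that it needs the tool.
    call = {"id": "call_0", "name": "get_length",
            "arguments": json.dumps({"text": "banana"})}
    return {"content": None, "tool_calls": [call]}


def run(prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]
    reply = fake_model(messages)                      # steps 1-3
    if not reply["tool_calls"]:
        return reply["content"]
    messages.append({"role": "assistant", "content": None,
                     "tool_calls": reply["tool_calls"]})
    for call in reply["tool_calls"]:                  # step 4
        args = json.loads(call["arguments"])
        result = TOOLS[call["name"]](**args)
        messages.append({"role": "tool", "tool_call_id": call["id"],
                         "content": result})
    return fake_model(messages)["content"]            # step 5


print(run("How long is the word 'banana'?"))  # The answer is 6.
```

The point to notice is that the model never runs anything itself: it only names a function and its arguments, and your program does the actual work.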

But as I said yesterday, today we're using llama, so let's see how Groq does it.

Tool use with Groq

Official documentation

Let's go straight to the official sample code.

from groq import Groq
import json

client = Groq()
MODEL = 'llama3-groq-70b-8192-tool-use-preview'

def calculate(expression):
    """Evaluate a mathematical expression"""
    try:
        result = eval(expression)
        return json.dumps({"result": result})
    except:
        return json.dumps({"error": "Invalid expression"})

def run_conversation(user_prompt):
    messages=[
    # Omitted here because it is too long
    ]
    tools = [
    # Also omitted here because it is too long
    ]
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        tools=tools,
        tool_choice="auto",
        max_tokens=4096
    )

    response_message = response.choices[0].message
    tool_calls = response_message.tool_calls
    if tool_calls:
        available_functions = {
            "calculate": calculate,
        }
        messages.append(response_message)
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            function_to_call = available_functions[function_name]
            function_args = json.loads(tool_call.function.arguments)
            function_response = function_to_call(
                expression=function_args.get("expression")
            )
            messages.append(
                {
                    "tool_call_id": tool_call.id,
                    "role": "tool",
                    "name": function_name,
                    "content": function_response,
                }
            )
        second_response = client.chat.completions.create(
            model=MODEL,
            messages=messages
        )
        return second_response.choices[0].message.content

user_prompt = "What is 25 * 4 + 10?"
print(run_conversation(user_prompt))

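Let's analyze the code

It defines a calculate function that evaluates the string it receives as a mathematical expression and returns the result as JSON.
The first call to create passes the extra tools-related parameters.

Before the second call to create, the code checks whether tool_calls is empty. This variable is a list of the function calls the model wants to make, like this: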

[ChatCompletionMessageToolCall(id='call_arm6', function=Function(arguments='{"expression": "25 * 4 + 10"}', name='calculate'), type='function')]

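If there are functions to call, the code runs each one and collects the result. It then appends that result to messages with role set to tool, which is how the API knows that this request carries a function result.

The idea is the same as OpenAI's: first let the model decide whether to use a tool, then act on that decision. If a tool is needed, the function's result is combined with the original prompt, and the model is called a second time to produce the final answer.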
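One thing worth keeping in mind about the sample: calculate passes text written by the model straight into eval. Below is a minimal sketch of a more restrictive alternative that only accepts basic arithmetic. The name `safe_calculate` and the set of allowed operators are my own choices for illustration, not part of the official sample.

```python
import ast
import json
import operator

# Only these operators are allowed; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Recursively evaluate a parsed arithmetic expression."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand)
    raise ValueError("unsupported expression")


def safe_calculate(expression):
    """Evaluate +, -, *, / arithmetic and return the result as a JSON string."""
    try:
        tree = ast.parse(expression, mode="eval")
        return json.dumps({"result": _evaluate(tree.body)})
    except (SyntaxError, ValueError, ZeroDivisionError):
        return json.dumps({"error": "Invalid expression"})


print(safe_calculate("25 * 4 + 10"))        # {"result": 110}
print(safe_calculate("__import__('os')"))   # {"error": "Invalid expression"}
```

It returns the same JSON shape as the original calculate, so it could be swapped in without changing the rest of the flow.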

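The official sample uses a model called llama3-groq-70b-8192-tool-use-preview, which Groq fine-tuned specifically for tool use. When the task is to "decide" whether a tool is needed and which one to use, this model is a safe choice.

One thing needs special attention: according to its model card, it only understands English. Keep that in mind when designing your flow, since you may need to translate the input first and only then let the model decide whether to use a tool. The examples from here on will mostly be in English!

Let's see what we get if we print response_message directly.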

ChatCompletionMessage(
    content=None,
    role='assistant',
    function_call=None,
    tool_calls=[
        ChatCompletionMessageToolCall(
            id='call_rdmh',
            function=Function(
                arguments='{"expression": "25 * 4 + 10"}',
                name='calculate'
            ),
            type='function'
        )
    ]
)

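Notice that content has become None. This is because the first request included the tools parameter, so the Groq API knows that the point right now is not how to reply but which tools to use.

If I change user_prompt to "Say hi to me", it prints the following. This looks just like the API usage from before: simply read the reply from content.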

ChatCompletionMessage(
    content='Hi!',
    role='assistant',
    function_call=None,
    tool_calls=None
)
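The two shapes above can be told apart with one check. The sketch below uses SimpleNamespace objects as stand-ins for the real response objects (that substitution is my assumption, made so the example runs without an API key), and `describe` is a made-up helper name.

```python
from types import SimpleNamespace


def describe(message) -> str:
    """Summarize a reply: either the requested tool names or the plain text."""
    if message.tool_calls:
        names = [call.function.name for call in message.tool_calls]
        return "tools: " + ", ".join(names)
    return "text: " + (message.content or "")


# Stand-ins that mimic the two printed ChatCompletionMessage objects.
tool_message = SimpleNamespace(
    content=None,
    tool_calls=[
        SimpleNamespace(
            id="call_rdmh",
            function=SimpleNamespace(
                name="calculate",
                arguments='{"expression": "25 * 4 + 10"}',
            ),
        )
    ],
)
text_message = SimpleNamespace(content="Hi!", tool_calls=None)

print(describe(tool_message))  # tools: calculate
print(describe(text_message))  # text: Hi!
```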

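Implementation

Once again we'll build a bot that counts how many r's are in strawberry, but this time with the Groq API and the llama3-groq-70b-8192-tool-use-preview model. We can get there by modifying the sample code directly.

Let's write the functions first. Besides the function that counts a specific letter in a string, I also want one that appends "XD" to whatever the bot says, purely for entertainment.

First request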

def calculate_letter_count(input_string, target_character):
    # Count how many times target_character appears in input_string
    return input_string.count(target_character)

def append_xd_to_string(input_string):
    return input_string + "XD"

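If your English isn't great and you don't know how to write the prompt, let AI write it for you, same as before.

Now we have the basic messages needed for the first request.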

messages=[
    {
        "role": "system",
        "content": "You are an assistant that can calculate how many times a certain letter appears in a string. Use the function calculate_letter_count to compute the count."
    },
    {
        "role": "user",
        "content": user_prompt,
    }
]

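Next come the tools. This time we have two functions, and one of them takes two parameters.

name: the function's name
description: what the function does
parameters: the function's parameters, keyed by parameter name; each value gives the data type and a description of the parameter
required: the arguments that must be passed; leave it empty if there are none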

tools = [
    {
        "type": "function",
        "function": {
            "name": "append_xd_to_string",
            "description": "Append 'XD' to the end of the input string",
            "parameters": {
                "type": "object",
                "properties": {
                    "input_string": {
                        "type": "string",
                        "description": "The string to append 'XD' to",
                    }
                },
                "required": ["input_string"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "calculate_letter_count",
            "description": "Calculate how many times a specific letter appears in a string",
            "parameters": {
                "type": "object",
                "properties": { 
                    "input_string": {
                        "type": "string",
                        "description": "The string to calculate the letter count of",
                    },
                    "target_character": {
                        "type": "string",
                        "description": "The letter to count in the input string",
                    }
                },
                "required": ["input_string", "target_character"],
            },
        },
    }
]
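Writing these schemas by hand gets tedious once there are more than a couple of functions. As a rough sketch, the same structure can be derived from a function's signature. `make_tool_schema` is a hypothetical helper of my own, not something the Groq SDK provides, and it assumes the function uses simple type hints and a docstring.

```python
import inspect

# Map a few Python type hints to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def make_tool_schema(func, param_descriptions):
    """Build a tool entry (same layout as above) from a function signature."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {
            "type": _JSON_TYPES.get(param.annotation, "string"),
            "description": param_descriptions.get(name, ""),
        }
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def calculate_letter_count(input_string: str, target_character: str) -> str:
    """Calculate how many times a specific letter appears in a string"""
    return str(input_string.count(target_character))


schema = make_tool_schema(
    calculate_letter_count,
    {
        "input_string": "The string to calculate the letter count of",
        "target_character": "The letter to count in the input string",
    },
)
print(schema["function"]["parameters"]["required"])
```

The descriptions still have to be written by a human (or an AI), since they are what the model reads when deciding which tool fits.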

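Now send the request. The tool_choice parameter tells the API whether this request should use tools:

auto: let the model decide whether to use a tool
required: a tool must be used
none: do not use any tool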

response = client.chat.completions.create(
    model=MODEL,
    messages=messages,
    tools=tools,
    tool_choice="auto",  # auto: let the model decide, required: must use a tool, none: never use a tool
    max_tokens=4096
)
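Besides the three string values, OpenAI-compatible APIs generally accept an object that forces one specific function. I am assuming Groq accepts the same shape here, so please check the official documentation before relying on it. The sketch below only assembles the keyword arguments and does not send anything; `build_request_kwargs` is a made-up helper name.

```python
def build_request_kwargs(model, messages, tools, force_tool=None):
    """Assemble keyword arguments for client.chat.completions.create."""
    kwargs = {
        "model": model,
        "messages": messages,
        "tools": tools,
        "max_tokens": 4096,
    }
    if force_tool is None:
        # Let the model decide whether a tool is needed.
        kwargs["tool_choice"] = "auto"
    else:
        # Assumed OpenAI-compatible form for forcing one named function.
        kwargs["tool_choice"] = {
            "type": "function",
            "function": {"name": force_tool},
        }
    return kwargs


kwargs = build_request_kwargs(
    model="llama3-groq-70b-8192-tool-use-preview",
    messages=[{"role": "user", "content": "How many 'a's are in 'banana'?"}],
    tools=[],  # the tools list from above would go here
    force_tool="calculate_letter_count",
)
print(kwargs["tool_choice"])
```

The result would then be passed along as client.chat.completions.create(**kwargs).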

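For user_prompt we ask "How many times does the letter 'a' appear in the string 'banana'?", telling it to count the a's in banana, and then print response.choices[0].message to inspect the result.

We can see that tool_calls has one element. The model knows that input_string is banana and that target_character is a, and it didn't spend any extra time generating content. Nice.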

ChatCompletionMessage(
    content=None,
    role='assistant',
    function_call=None,
    tool_calls=[
        ChatCompletionMessageToolCall(
            id='call_0w0f',
            function=Function(
                arguments='{"input_string": "banana", "target_character": "a"}',
                name='calculate_letter_count'
            ),
            type='function'
        )
    ]
)

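Second request

First we check whether tool_calls is empty. If it is, we output the answer directly; if not, we need to call the functions. I think the official sample is good enough, so let's just reuse it!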

response_message = response.choices[0].message
tool_calls = response_message.tool_calls
if tool_calls:
    print(tool_calls)
    available_functions = {
        "append_xd_to_string": append_xd_to_string,
        "calculate_letter_count": calculate_letter_count,
    }
    messages.append(response_message)
    for tool_call in tool_calls:
        function_name = tool_call.function.name
        function_to_call = available_functions[function_name]
        function_args = json.loads(tool_call.function.arguments)
				
        # There are several functions now, so remember to branch on the name
        if function_name == "calculate_letter_count":
            function_response = function_to_call(
                input_string=function_args.get("input_string"),
                target_character=function_args.get("target_character")
            )
        elif function_name == "append_xd_to_string":
            function_response = function_to_call(
                input_string=function_args.get("input_string")
            )
        
        messages.append(
            {
                "tool_call_id": tool_call.id,
                "role": "tool",
                "name": function_name,
                "content": function_response,
            }
        )
    second_response = client.chat.completions.create(
        model=MODEL,
        messages=messages
    )

Run the code and you'll hit this error:

groq.BadRequestError: Error code: 400 - {'error': {'message': "'messages.3' : for 'role:tool' the following must be satisfied[('messages.3.content' : value must be a string)]", 'type': 'invalid_request_error'}}

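This happens because the content of a role: tool message, which is whatever the tool function returned, must be a string. Converting the return value fixes it.

def calculate_letter_count(input_string, target_character):
    # Count how many times target_character appears in input_string
    return str(input_string.count(target_character))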
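Instead of remembering to add str() inside every tool function, the conversion can happen in one place just before the result is appended to messages. This is a small sketch; `to_tool_content` is a helper name I made up.

```python
import json


def to_tool_content(result) -> str:
    """Turn any tool return value into the string that a role: tool message needs."""
    if isinstance(result, str):
        return result
    # Numbers, lists and dicts are serialized as JSON text.
    return json.dumps(result, ensure_ascii=False)


print(to_tool_content("banana".count("a")))  # 3
print(to_tool_content({"count": 3}))         # {"count": 3}
```

With this in place, the tool functions can keep returning natural Python values such as int or dict.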

The full code is as follows

from groq import Groq
import json

client = Groq()
MODEL = 'llama3-groq-70b-8192-tool-use-preview'
# MODEL = 'llama3-groq-8b-8192-tool-use-preview'
# MODEL = 'llama3-8b-8192'

def calculate_letter_count(input_string, target_character):
    # Count how many times target_character appears in input_string
    return str(input_string.count(target_character))

def append_xd_to_string(input_string):
    return input_string + "XD"

def run_conversation(user_prompt):
    messages=[
        {
            "role": "system",
            "content": "You are an assistant that can calculate how many times a specific letter appears in a string. Use the calculate_letter_count function to calculate the count. If you find what the other person says interesting, use append_xd_to_string to add an 'XD' at the end of your response."
        },
        {
            "role": "user",
            "content": user_prompt,
        }
    ]
    tools = [
        {
            "type": "function",
            "function": {
                "name": "append_xd_to_string",
                "description": "Append 'XD' to the end of the input string",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "input_string": {
                            "type": "string",
                            "description": "The string to append 'XD' to",
                        }
                    },
                    "required": ["input_string"],
                },
            },
        },
        {
            "type": "function",
            "function": {
                "name": "calculate_letter_count",
                "description": "Calculate how many times a specific letter appears in a string",
                "parameters": {
                    "type": "object",
                    "properties": { 
                        "input_string": {
                            "type": "string",
                            "description": "The string to calculate the letter count of",
                        },
                        "target_character": {
                            "type": "string",
                            "description": "The letter to count in the input string",
                        }
                    },
                    "required": ["input_string", "target_character"],
                },
            },
        }
    ]
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        tools=tools,
        tool_choice="auto",  # auto: let the model decide, required: must use a tool, none: never use a tool
        max_tokens=4096
    )
    response_message = response.choices[0].message
    
    # Get the tool calls requested by the model
    tool_calls = response_message.tool_calls
    if tool_calls:
        print(tool_calls)
        available_functions = {
            "append_xd_to_string": append_xd_to_string,
            "calculate_letter_count": calculate_letter_count,
        }
        messages.append(response_message)
        for tool_call in tool_calls:
            function_name = tool_call.function.name
            function_to_call = available_functions[function_name]
            function_args = json.loads(tool_call.function.arguments)
            print(function_args)
            if function_name == "calculate_letter_count":
                function_response = function_to_call(
                    input_string=function_args.get("input_string"),
                    target_character=function_args.get("target_character")
                )
            elif function_name == "append_xd_to_string":
                function_response = function_to_call(
                    input_string=function_args.get("input_string")
                )
            print(function_response)
            
            messages.append(
                {
                    "tool_call_id": tool_call.id,
                    "role": "tool",
                    "name": function_name,
                    "content": function_response,
                }
            )
        second_response = client.chat.completions.create(
            model=MODEL,
            messages=messages
        )
        return second_response.choices[0].message.content
    # No tool was requested, so return the model's direct reply
    return response_message.content

# user_prompt = "How many times does the letter 'a' appear in the string 'banana'?xd"
# user_prompt = "How many times does the letter 'a' appear in the string 'banaaaaaaaaxxxxna'?"
# user_prompt = "你知道 srtrawberrrrrry 有多少個 r 嗎"
# user_prompt = "你知道 strawberry 有多少個 r 嗎？笑死你最好會"
user_prompt = "Do you know how many 'r's are in 'strawberry'? You better know, or I'll laugh to death!"
print(run_conversation(user_prompt))

Testing and debugging

Test 1
- Q: "How many times does the letter 'a' appear in the string 'banana'?"
- A: The letter 'a' appears 3 times in the string 'banana'.
Test 2
- Q: "How many times does the letter 'a' appear in the string 'banaaaaaaaaxxxxna'?"
- A: The letter 'a' appears 10 times in the string 'banaaaaaaaaxxxxna'.
Test 3
- Q: 你知道 strawberry 有多少個 r 嗎 (Do you know how many r's are in strawberry?)
- A: The letter 'r' appears 3 times in the word 'strawberry'.
Test 4 (surprisingly, it understands a little Chinese too)
- Q: 你知道 srtrawberrrrrry 有多少個 r 嗎 (Do you know how many r's are in srtrawberrrrrry?)
- A: The letter 'r' appears 8 times in the string "srtrawberrrrrry". XD
Test 5 (as expected, this one still doesn't work…)
- Q: 你知道 strawberry 有多少個 r 嗎？笑死你最好會 (Do you know how many r's are in strawberry? LOL, as if you would)
- It crashed QQ: TypeError: append_xd_to_string() got an unexpected keyword argument 'target_character'
Test 6 (what if I translate it into English…?)
- Q: Do you know how many 'r's are in 'strawberry'? You better know, or I'll laugh to death!
- A: There are 2 'r's in 'strawberry'.XD (what the heck)
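The crash in Test 5 comes from the model sending an argument that the chosen function does not accept. A more defensive dispatcher can drop unknown arguments and report problems back as text instead of raising. This is only a sketch: `dispatch` is my own helper, and silently ignoring extra arguments is a design choice you may not want in every application.

```python
import inspect
import json


def calculate_letter_count(input_string, target_character):
    return str(input_string.count(target_character))


def append_xd_to_string(input_string):
    return input_string + "XD"


AVAILABLE_FUNCTIONS = {
    "calculate_letter_count": calculate_letter_count,
    "append_xd_to_string": append_xd_to_string,
}


def dispatch(function_name, arguments_json):
    """Run one tool call and always return a string, even on failure."""
    func = AVAILABLE_FUNCTIONS.get(function_name)
    if func is None:
        return json.dumps({"error": f"unknown function: {function_name}"})
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError:
        return json.dumps({"error": "arguments are not valid JSON"})
    if not isinstance(args, dict):
        return json.dumps({"error": "arguments must be a JSON object"})
    # Keep only the arguments that the function actually accepts.
    accepted = inspect.signature(func).parameters
    kwargs = {key: value for key, value in args.items() if key in accepted}
    try:
        return str(func(**kwargs))
    except TypeError as exc:  # for example, a required argument is missing
        return json.dumps({"error": str(exc)})


# The extra target_character no longer crashes append_xd_to_string.
print(dispatch("append_xd_to_string", '{"input_string": "hi", "target_character": "r"}'))
print(dispatch("calculate_letter_count", '{"input_string": "strawberry", "target_character": "r"}'))
```

Because errors come back as ordinary tool content, the second request can still go out, and the model gets a chance to explain or retry.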

Let's find out why it still got the count wrong by printing messages:

[
    {
        'role': 'system',
        'content': "You are an assistant that can calculate how many times a specific letter appears in a string. Use the calculate_letter_count function to calculate the count. If you find what the other person says interesting, use append_xd_to_string to add an 'XD' at the end of your response."
    },
    {'role': 'user', 'content': "Do you know how many 'r's are in 'strawberry'? You better know, or I'll laugh to death!"},
    ChatCompletionMessage(
        content=None,
        role='assistant',
        function_call=None,
        tool_calls=[
            ChatCompletionMessageToolCall(
                id='call_52ct',
                function=Function(arguments='{"input_string": "strawberry", "target_character": "r"}', name='calculate_letter_count'),
                type='function'
            ),
            ChatCompletionMessageToolCall(
                id='call_1xeg',
                function=Function(arguments='{"input_string": "There are 2 \'r\'s in \'strawberry\'."}', name='append_xd_to_string'),
                type='function'
            )
        ]
    ),
    {'tool_call_id': 'call_52ct', 'role': 'tool', 'name': 'calculate_letter_count', 'content': '3'},
    {'tool_call_id': 'call_1xeg', 'role': 'tool', 'name': 'append_xd_to_string', 'content': "There are 2 'r's in 'strawberry'.XD"}
]
There are 2 'r's in 'strawberry'.XD

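It turns out that the append_xd_to_string tool call is what confuses the model. Both tool calls were issued in the same turn, so the sentence passed to append_xd_to_string was written before the count of 3 ever came back.

On reflection, this is something the model can perfectly well handle on its own, without an extra API call, so I removed it. This teaches us a lesson: make good use of what the model can already do, and don't add unnecessary steps.

So delete all the code related to append_xd_to_string and take "append XD to the sentence when something is funny" out of the tool flow entirely. In my tests, keeping it as a tool made the model interpret the task as "append "XD" to the string and then count the letters", or made it call the functions twice. You only find these things out by testing: before artificial intelligence, you still need a bit of manual intelligence.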

SYSTEM
You are an assistant that can calculate how many times a specific letter appears in a string. Use the calculate_letter_count function to calculate the count. Append 'XD' to the end of your response if user say something funny.

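Let's try Test 6 again. I won't list the code here; looking only at the result, it turned out well.

After all those twists and turns, we finally got it working. Give yourself a round of applause 👏

Additional notes

Groq's official documentation recommends using a Routing System, which means classifying the question first and only then handing it to the tool use model. Classifying questions is a common technique, and later articles in this series will cover it as well.
If you get stuck while using the llama3-groq-70b-8192-tool-use-preview model, you might as well try the 8b llama3 tool use model. In my unscientific testing it also works well, and it understands a tiny bit of Chinese.
If you need structured output in chat mode (chat mode is the second request in the example above, while the first request is tool use mode), you can refer to this article.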
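To make the routing idea in the first note a little more concrete, here is a toy sketch. In a real system the classification would probably be done by a general-purpose model; the keyword rule in `needs_tool` is just a placeholder I made up so the example can run offline, and the two model names are the ones that already appear in the full code above.

```python
TOOL_MODEL = "llama3-groq-70b-8192-tool-use-preview"
GENERAL_MODEL = "llama3-8b-8192"


def needs_tool(user_prompt: str) -> bool:
    """Hypothetical classifier: a crude keyword rule standing in for a model."""
    text = user_prompt.lower()
    return "how many" in text or "count" in text


def pick_model(user_prompt: str) -> str:
    """Route counting questions to the tool use model, everything else to chat."""
    return TOOL_MODEL if needs_tool(user_prompt) else GENERAL_MODEL


print(pick_model("How many times does the letter 'a' appear in 'banana'?"))
print(pick_model("Say hi to me"))
```

Only prompts routed to TOOL_MODEL would be sent with the tools parameter; everything else goes through the ordinary chat flow.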